Text File | 1994-01-17 | 28KB | 509 lines

UU version 2.1 -- A small, fast, and smart uudecoder
(C) January 1994 -- ir.drs. B.J. Walbeehm
January 17, 1994


Introduction
~~~~~~~~~~~~

Until further notice, whenever I have a new version of UU, I shall upload it to the FTP site wuarchive.wustl.edu (directory: /pub/MSDOS_UPLOADS/uucode), as well as to the alt.binaries.pictures.misc and alt.binaries.pictures.utilities newsgroups on USENET.

UU is a freeware program; please read the file INFO.TXT for more information on what I mean by this. If the file INFO.TXT was not included in your UU package, then you can obtain it by e-mailing (which is preferred), writing, or calling (which is least preferred) me; the addresses may be found at the end of this file. In short, the only thing I ask from you when you decide that this program is of use to you, is that you send me an e-mail.

I have written this program primarily for my own convenience; the first time I downloaded (a lot of) uuencoded files from the USENET binaries, it took me over four hours to edit everything in such a way that the only uudecoder I had then (a very naive one) could process them. That was a once-but-never-again experience.

Starting with this program, I have broken with my rule to write programs that run even on an 8086 based machine. The reason is that (as I said) I write my programs first and foremost for myself, and since I "never" use an 8086 ... But I can easily convert this program to an 8086 compatible version, and on popular demand, I may even be willing to do this. Just let me know if you desperately want an 8086 compatible version. For all clarity: UU version 2.1 requires an 80286 or higher.

I have not yet figured out what the minimal DOS version is that this program requires. (I am currently using MS-DOS 6.20, and I do not have versions of MS-DOS lying around lower than 5.00.) Anyway, I am quite sure that UU also runs on "very low" DOS versions. I learnt that there still are people using an 8088 based machine ...
are there actually still people using, say, MS-DOS 3.00 or below? Or are these versions extinct?

As for memory requirements: The amount of RAM free for executables should be at least 65k (UU uses two 28k buffers to speed up reading and writing) for this program to work correctly. UU will check if there is enough RAM free, and complain if there is not. (I hear some people asking: "65k?" ... Yes, I know we are talking .COM here, but that does NOT mean we are restricted to 64k now, does it?)

As with all the programs I write, a short usage message is included in UU. This message may be displayed by entering any of the following three commands:

    UU /?
    UU -?
    TYPE UU.COM

Starting with version 2.0, UU no longer displays a usage message when one merely enters "UU". The reason for this is that I think that one should never get accustomed to invoking a program without parameters or switches just to get help, for there are numerous programs that really do something then. In fact, I have written a program ("REMDIR.EXE") that can have disastrous effects when invoked that way (depending on whether one really wants it to do what it does then). What I am trying to say is: Never rely on a program to give you help by invoking it without any parameters or switches ...


On the uuencoding standard
~~~~~~~~~~~~~~~~~~~~~~~~~~

In my opinion, the uuencoding standard is not very well thought-out. As long as an encoded file consists of only one section (in the early days, splitting an encoded file up into more than one section was most probably not allowed), there is not much wrong with the standard, but as soon as the necessity arose for files to be split up, the standard should have been changed as well. To start with, there is no standard way of designating non-section parts, so the standard provides us with no means whatsoever to distinguish between encoded sections and mere comments. Also, the standard does not describe a way of deciding which sections belong together, nor in which order.
Most uuencoders put such additional information in the files, but with the lack of a standard, almost every single one of them has its own way of doing this. A number of encoders will also put one or more checksums in the file, but again, this has not been standardised. It would have been very easy to devise a standard for adding such additional information, but it has not been done, and it may be far too late now ...


Command line parameters and switches
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Although the usage message says "UU [drive:][path]filename[.ext] [/I] [/S]", UU allows all kinds of variations on this: Instead of a slash ("/"), a dash ("-") is accepted as well. UU of course accepts both uppercase and lowercase, and ignores irrelevant blanks (spaces). Also, using a switch twice or more has the same effect as using it only once. Moreover, switches (currently, the switches are "I" and "S") may be combined, and the order in which the filename and the switches (if any) appear on the command line is irrelevant. This means that, for instance, all of the following commands are treated identically:

    UU example.uue /I /S
    UU example.uue -I -S
    Uu exAmplE.Uue/s -I
    uu/s example.uue/i
    uu example.uue -is
    uu /is example.uue
    uu example.uue /s/i
    uu/i -sisssis example.uue

Please note that if the dash ("-") is used to precede a switch, it must be preceded by at least one blank, since DOS allows dashes also to be part of a filename (EXCEPT as the LEADING character of a filename). This means that the following two commands are NOT identical:

    uu temp-i
    uu temp/i

The former command processes a file called "temp-i" using no switches, while the latter will use the switch "i" on a file called "temp".
So if the latter interpretation is meant, and one wants to use the dash, then make sure that at least one blank precedes it, as in:

    uu temp -i


What UU does, and does not do
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Unlike what I have seen in some other uudecoders, UU does NOT assume an extension of .UUE if no extension is given. (Let me know if this bothers you.) This is for my own convenience, since most of the files I get to process have no extension.

The current version of UU does not allow encoded files to be split up into different files, so if, for example, a file called EXAMPLE.EXE has been converted (by some uuencoder) into three sections, and each section has been written to a different file, say EXAMPLE1.UUE, EXAMPLE2.UUE, and EXAMPLE3.UUE, then UU will not be able to retrieve the original file. Until I have implemented a multiple source handler, one can work around this restriction by first executing the following command (still using the same example) from the DOS prompt:

    COPY /b EXAMPLE1.UUE + EXAMPLE2.UUE + EXAMPLE3.UUE EXAMPLE.TMP

and then feeding the resulting combined file EXAMPLE.TMP to UU. Note that the /b switch has been added just to be sure; if the source files come directly from a (any) uuencoder, then it will not be necessary, but in that case it will not harm either. Some posting programs, however, put a CTRL-Z character in the file, in which case the /b switch is absolutely required.

Please note also that if (and only if) the (here) three source files appear in increasing order in the directory (so EXAMPLE1.UUE comes before EXAMPLE2.UUE, which in turn comes before EXAMPLE3.UUE), the following DOS command will correctly combine them as well:

    COPY /b EXAMPLE*.UUE EXAMPLE.TMP

The restriction of the files appearing in increasing order in the directory when using the latter COPY command usually does not apply when UU is used in its "unsorted sections" mode on the resulting file.
For more information on unsorted sections, see the appropriate chapter in this manual.

Please note that in order for the COPY command to work correctly, the resulting file (EXAMPLE.TMP in the above examples) should have an extension that differs from any of the files that are to be concatenated (the files ending in .UUE in the above examples).

If no switches are used (and ONLY then), UU does not allow sections to be in any other than increasing order in the file. (Please refer to the chapter on unsorted sections for information on how to handle these.) In particular, this means that this version acts the same as the earlier 1.x versions in case no switches are used. In this mode, the 2.x versions are still as fast as UU version 1.3 (which is the fastest of the 1.x versions), so even if one never dealt with unsorted sections, then the only advantage of using version 1.3 would be its smaller size. One of the disadvantages of version 1.3 is that it contains a small bug -- due to one (!) erroneous byte, it does not allow the INPUT file to have a name of length 1.

UU always allows the source file to contain more than one uuencoded file, and each of these files may consist of any number of sections. If no switches are used, then these sections MUST be in the correct order. So in this case, a file containing the following sections:

    <file 1 part 1>
    <file 1 part 2 (last part)>
    <file 2 part 1>
    <file 2 part 2>
    <file 2 part 3 (last part)>

will be handled correctly by UU (and result in two files), whereas

    <file 1 part 1>
    <file 2 part 1>
    <file 1 part 2>
    <file 2 part 2>
    <file 2 part 3>

and

    <file 1 part 2>
    <file 1 part 1>
    <file 2 part 1>
    <file 2 part 3>
    <file 2 part 2>

will not. Again, this restriction does NOT apply when UU is told that the file may contain unsorted sections.

When used in the "sorted order" mode of operation, UU can handle any number of sections contained in one input file; there is no limit.
The only thing that may happen (apart from your hard disk getting full), is that some of the numbers that UU displays will not be correct, but this only happens if the number of sections in one file exceeds 9999. (Yes, I know I used the number 65535 in a previous manual, but that was a mistake. That is what happens when you socialise with computers too much.)

If the program terminates or aborts after having detected some error, an ERRORLEVEL of 1 is returned; a successful termination results in ERRORLEVEL 0.

Some platforms do not have the restriction of filenames being only at most 8+3 characters long, so the filename in the header of the first section of an encoded file may not be DOS-compliant. UU recognises this, and prompts the user for a new filename.

If the filename for an encoded file already exists, the user is informed of this, and may then choose to either overwrite the old file, or rename the new one. At this point, CTRL-Break (and CTRL-C) may be used to abort the process.

As opposed to some other uudecoders, UU does not choke on CTRL-Z characters.

UU ignores lines that are not uuencoded, typically before and after sections. I saw somewhere that a uudecoder written by someone else could be notified that (for example) "---" is not a decodable line, as it seems that this line is used as a cut line on several BBS systems. With UU, it is not possible to designate such a non-decodable line ... merely because UU does not need that information to determine that a given line is not to be treated as a uuencoded line. UU uses four ways to determine whether a line is a mere comment or not, and treats the line as an encoded line only if all four ways show it is not a comment. These tests are partly performed simultaneously, and always in such a way as to require hardly any additional time (e.g. when the data required for a test is available due to some other action currently being performed).
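
The four tests themselves are not spelled out in this manual. Purely as an illustration, the Python sketch below shows the kind of checks a decoder could make on a candidate line, assuming the classic uuencode line layout (a length character followed by groups of four characters, all in the range space to backquote). The function name looks_uuencoded and the exact checks are assumptions of this sketch, not a description of UU's code.

```python
def looks_uuencoded(line: str) -> bool:
    """Heuristic guess: could this be one line of uuencoded data?"""
    line = line.rstrip("\r\n")
    if not line:
        return False
    # Check 1: only characters from space (32) to backquote (96) may occur.
    if any(not 32 <= ord(ch) <= 96 for ch in line):
        return False
    # Check 2: the first character gives the number of decoded bytes (0..45).
    nbytes = (ord(line[0]) - 32) & 0x3F
    if nbytes > 45:
        return False
    # Check 3: every 3 bytes travel as 4 characters, so the length must fit.
    # One extra trailing character is tolerated (an assumption of this sketch).
    needed = (nbytes + 2) // 3 * 4
    body = len(line) - 1
    return needed <= body <= needed + 1


for candidate in ("#0V%T", "---", "Subject: EXAMPLE.ZIP (4/6)"):
    print(repr(candidate), looks_uuencoded(candidate))
```

With checks like these, a cut line such as "---" is rejected without being named explicitly: its first character would announce 13 data bytes, which need 20 encoded characters, but only two follow.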
Although UU is quite intelligent, it is possible to fool it, but I think that this is purely academic, for the chances of it being fooled are astronomically small (unless someone intentionally fooled UU). Even if one decoded hundreds of thousands of uuencoded files, it would most probably occur not even once that UU was fooled. And if it should ever occur that UU is fooled, then, please, do not blame UU or me, but blame the one who invented the uuencoding standard for not making it more strict. Or, put in another way: All uudecoders can be fooled, but mine must be one of the most reliable ones as I can easily show by a simple computation of probabilities.

Of course, UU cannot perform miracles, so if the uuencoded file is corrupt to begin with, UU will be helpless too.


Handling unsorted sections
~~~~~~~~~~~~~~~~~~~~~~~~~~

UU can also handle files containing randomly ordered sections. For this mode of operation, two switches are available: /I and /S. When invoked with /I only, UU will scan the source file, and it will subsequently report what it has found there, but it will not actually decode anything. When invoked with both /I and /S (or any equivalent notation -- see the chapter on command line parameters and switches), it WILL start decoding after having reported the information. A less verbose, but equally efficient result is obtained by specifying only the /S switch.

Although there is a maximum to the number of sections that UU can handle using this "unsorted sections" mode of operation, this can hardly be considered a restriction, since this maximum number is 434.

This mode of operation, although still very fast, is slower than the "sorted order" mode. Just how much slower depends on the order in which the sections appear. Worst case performance (in terms of speed) is when the sections appear in reversed order; considerable gains may be achieved on systems using disk caches and/or RAM drives.

Since the "sorted order" mode uses one very powerful assumption (viz.
the sections being in sorted order), whereas the "unsorted sections" mode can (at best) only rely on whatever information it filters out of the source file, it is possible for UU to obtain better results in the former mode. So I recommend using the "sorted order" mode whenever one is sure that every section appears in the correct order (which, as noted earlier, also is faster).

So how does UU obtain its information? The current version of UU recognises more than fifteen different uuencoders and posting programs. (For ease of discussion, I shall use the term "uuencoders" when I mean "uuencoders and/or posting programs" in the remainder of this manual.) As far as I know, these mostly are uuencoders used on PCs and UNIX systems, but I'd rather wait with listing the uuencoders it recognises until I have found out which ones most of them are.

If it cannot recognise the uuencoders that were used, or if these have not included all of the necessary information in the file, UU tries to use the "Subject:" lines (if it finds any) that may be included if the file contains postings from USENET. Instead of "Subject:" lines, some newsreaders produce "Description:" lines; these are also supported by UU. In the remainder of this manual, I shall no longer refer to "Description:" lines, but whatever holds for "Subject:" lines, also applies to "Description:" lines.

If postings from USENET are used, I recommend NOT chopping off the headers (and thus the "Subject:" lines) for a higher chance of success. "Subject:" lines are used only if all else fails, because of the higher chance of these containing errors. For instance, someone may have erroneously given a five part file a subject line of "EXAMPLE.ZIP (4/6)" indicating that there are six parts. But even when things like this happen, there is a good chance that UU will successfully decode these files all the same.
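
To make the idea concrete, here is a deliberately naive Python sketch of pulling a filename and a part count out of such a line. It is not UU's parser; the patterns, the function name parse_subject, and the choice to accept only "n/m" style counters are assumptions made for illustration.

```python
import re

# Hypothetical patterns: a DOS-style 8+3 filename and an "n/m" part counter.
NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,8}\.[A-Za-z0-9]{1,3}")
PART_RE = re.compile(r"(\d+)\s*/\s*(\d+)")
PREFIXES = ("subject:", "description:")


def parse_subject(line):
    """Return (filename, part, total), or None if nothing usable is found."""
    lowered = line.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            rest = line[len(prefix):]
            break
    else:
        # Neither a "Subject:" nor a "Description:" line.
        return None
    name = NAME_RE.search(rest)
    part = PART_RE.search(rest)
    if name is None or part is None:
        return None
    return name.group(0), int(part.group(1)), int(part.group(2))


print(parse_subject("Subject: EXAMPLE.ZIP (4/6)"))
```

A real parser has to cope with far messier lines than this one can, and, as noted above, the counts it returns may still be wrong if the poster mislabelled the parts.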
To end this subject (no pun intended), some examples of "Subject:" lines, and how they will be processed by UU:

- Subject: EXAMPLE.ZIP (4/6)
  UU sees this as part four of a six part file called EXAMPLE.ZIP.

- Subject: PICTURE.GIF {Just another picture} [01/10]
  As expected, UU will see this as part one of a ten part file called PICTURE.GIF.

- Subject: Repost:AGAIN.EXE(Part3of20).Reposted on popular demand.
  Yes, UU will assume it is dealing with part three of a twenty part file called AGAIN.EXE.

- Subject: >FOOBAR.JPG (b/w) {Another picture} (part 3/5.
  UU is not fooled by "(b/w)", nor by the ">"; it will correctly assume this is part three of a five part file called FOOBAR.JPG.

- Subject: - FooBar.Jpg {Another picture /0 } part04 of5} (6 /w ).
  Even this does not fool UU; it assumes to be dealing with part four of a five part file called FooBar.Jpg. Moreover, UU will see this as a further part of the same file as in the previous example.

Although these examples show that UU is quite "intelligent" while dealing with these lines, I realise that my "Subject:" line parser still leaves room for improvement. Either way, the name it finds in the "Subject:" line is not all that important since the name of the file also appears in the header of the first section of a uuencoded file. And most of the time (so even when it comes up with false information from the "Subject:" line), it will yield a correct result anyway.

And while on the subject of filenames: Most of the uuencoders also include the filename at the start of each (so not only the first) section, one way or another. For at least some of them, it may be the case that this name differs from the one that is in the header of the first section. And of course, this is also possible for the name UU filters out of the "Subject:" line. That is why, when using the /I switch, UU will give two names for each section it finds. The real name (i.e. the one from the header of the first section) is the one that is NOT parenthesised.
And although UU will display the names exactly as they appear in the file, it will perform a case-insensitive comparison between these names, thus making up for capitalisation inconsistencies by the person who posted the file.

Also when using the /I switch, UU will give the section number and the total number of sections for each section (as far as this could be determined of course). This is displayed as in "(003/010)", which would mean that this section is part three of a ten part file. Whenever a number could not be determined, "000" is printed instead.

Finally (still when using the /I switch only), UU displays some information on any section it will not be able to process, as well as the reason for this.

The remainder of this chapter holds for both the /I and /S switches:

Whenever a filename that was encountered is longer than twelve characters, only its first eleven characters will be displayed, with an asterisk (*) appended. Of course, the full name will be displayed when prompting the user for a new filename.

When UU has scanned the input file, it will list the names, and numbers of sections of each COMPLETE file it has found. It also gives the total number of sections it has found, the number of sections it could not identify, and the number of sections that may be processed. Note that the latter number is not necessarily the difference of the former two, because there are various reasons that a section that WAS identified cannot be processed after all (for example when there are other sections of the same file missing). The actual reason will usually be given while using the /I switch.

I have done my very best to make UU as smart as possible, but as noted earlier, due to the fact that the uuencoding standard is not strict enough, even the most intelligent uudecoder may not be able to correctly figure everything out.
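
The display conventions described in this chapter are simple enough to sketch in a few lines of Python. The helper names below are hypothetical, and the code only mirrors the behaviour as documented here; it says nothing about how UU implements it.

```python
def display_name(name: str) -> str:
    """Names longer than twelve characters: first eleven plus an asterisk."""
    return name if len(name) <= 12 else name[:11] + "*"


def display_part(part, total) -> str:
    """Format as (003/010); an undetermined number (None) is shown as 000."""
    return "({:03d}/{:03d})".format(part or 0, total or 0)


def same_file(name_a: str, name_b: str) -> bool:
    """Names are shown as found, but compared without regard to case."""
    return name_a.lower() == name_b.lower()


print(display_name("VERYLONGNAME.JPEG"), display_part(3, 10))
print(same_file("FOOBAR.JPG", "FooBar.Jpg"))
```

Using None for "could not be determined" is a choice of this sketch; the manual only states that "000" is printed in that case.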
Let me end this chapter by quoting Nick Viner: "Of course some files which have been split by hand and not labelled adequately will always defeat it!"


Plans for future versions of UU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With the exception of the first two points, the following plans do not have high priority for me ... but I am open to suggestions, so if you have any arguments in favour of any of these, or perhaps some new suggestions, please let me know.

I can think of several ways of making UU even smarter. For instance, by adding support for even more uuencoders (if I find any). Another option is to have UU use the information it gathers but does not use so far, so as to have it make its own assumptions about sections that could only be partially identified. The latter case would then be as if UU said "These sections probably belong together ... well, let's assume they do, and process them.". Finally, the routine that deals with USENET's "Subject:" lines could be made yet a little smarter.

Another option I plan to add, is to have UU be able to write every section that has not been processed to a separate file. Related to this would be an option to have UU output all non-encoded data.

I am considering having UU be able to handle files whose sections are not all contained in one and the same file, so the PART1.UUE PART2.UUE PART3.UUE scheme, but I should add that this does not have high priority, since I only need this very rarely, and for these rare cases, I do not mind using the COPY command first in order to put everything in one file. As an alternative to the former, or even in addition to it, I may some day have UU accept wildcards in the filename.

I am considering adding a switch (/d for instance) allowing one to have the input file deleted after it has been SUCCESSFULLY processed. Again, this does not have high priority for me, but on the other hand, it would be very easy to add this. So anyone in favour of this is kindly requested to react.
(People who are against this option do not have to react I guess, because no one is forced to actually use all UU's options.)

Some uuencoders put checksums in the files. I may have a future version of UU be able to check these.

I may also write a uuencoder that is likewise very fast, and even smaller.

I may add a third option to UU in case a file already exists, viz. "skip", which will allow the user to choose not to process this file, and continue with the next (if any).

I may also add support for xxencoded files to UU.

Someone suggested it would be nice if one could change UU's defaults, so that, for example, the /S switch would then be assumed automatically. I do not like to do this, since it would make using UU less easy. I think that naive users would be frightened by the prospect of having to edit some configuration file (or something like that) first. Moreover, I think typing "UU/S" instead of "UU" cannot be a real bother. Or stated differently: If I had given this program a longer name, then those extra characters would have to be entered anyway.


Acknowledgements
~~~~~~~~~~~~~~~~

I should like to thank the following persons:

- Terry O'Brien for sending me detailed information on the file mode code in the header of uuencoded files, and on uuencoding in general.

- Martin (sorry, don't know your last name) from Nottingham (?) for telling me about the bug :-( in version 1.1 (and 1.0).

- Brian Norris for telling me about the bug :-( in version 1.3 (and earlier versions).

- Douglas Swiggum for all the trouble taken in sending me "strange" uuencoded files, and detailed descriptions of what happened. You have saved me a lot of time in finding two bugs :-( in version 2.0!

Last but not least, I should like to thank all the people who have let me know they appreciate my program, or otherwise (e.g. by telling me about bugs) mailed me regarding UU.


Release history
~~~~~~~~~~~~~~~

In my convention of version numbers, 0.x versions denote usually unreleased prototype versions.
Versions 0.1 through 0.4, and 0.6 were private, unreleased versions, written in a mixture of Pascal and Assembly-language. Version 0.5 was given to but a few people to see how they liked it. It had resulted from a process of stepwise refinement in which speed, size, feedback, and user-friendliness were tackled. Versions 0.1 through 0.5 were all written on 11-Dec-93. They were EXE files, and the latter had a size of 5872 bytes.

UU 0.6    Type: EXE    Size: 3424    Date: 14-Dec-93
    The last prototype version. Most of it written in assembly.
    Yet a bit faster than 0.5.

UU 1.0    Type: COM    Size: 1993    Date: 15-Dec-93
    The first publicly released version. But for some tiny details this is the full-assembly version of 0.6.

UU 1.1    Type: COM    Size: 1965    Date: 18-Dec-93
    Even smarter in distinguishing comment lines from encoded lines (a fourth test has been added).
    Sections containing only one non-empty line are now recognised as such.
    Detects when the disk is full, upon which it aborts with an appropriate message.
    Yet a bit faster than 1.0.

UU 1.2    Type: COM    Size: 1896    Date: 23-Dec-93
    Now really only accepts "y", "Y", "n", and "N" while asking permission to overwrite an existing file. Also, CTRL-Break (and CTRL-C) can be used at this point to abort the program immediately.

UU 1.3    Type: COM    Size: 1892    Date: 25-Dec-93
    In earlier versions, lines of more than 255 characters COULD (although it is HIGHLY improbable they actually WOULD) result in decoded files being corrupted; starting with this version, this can no longer happen.
    Yet a bit faster than 1.2 (amongst others (but not only!) because the read and write buffers now each are 4k larger).

UU 2.0    Type: COM    Size: 5866    Date: 09-Jan-94
    Now also allows files containing unsorted sections.
    An intelligent command line parser has been added. Because of this, the bug of UU not accepting filenames of length 1 in the command line (in fact, I did not even know about this bug until some time after I had finished the parsing routines) no longer exists.
    Aborts with an appropriate message if there is not enough (conventional) RAM free.
    Displays an error message if invoked without any parameters or switches.

UU 2.1    Type: COM    Size: 6257    Date: 17-Jan-94
    I really thought I had solved the problem of lines containing more than 255 characters in version 1.3, but I had not; now, it is REALLY fixed.
    Added support for five more uuencoders and posting programs, as well as for "Description:" lines.
    Made the parser for "Subject:" (and "Description:") lines even more intelligent.
    Fixed a bug that seemed to matter only when run from the DOS box under Windows.
    The maximum number of unsorted sections UU can handle is slightly higher.
    Some minor changes not worth mentioning.


Contacting the author <-- Hey, that's me! :-)
~~~~~~~~~~~~~~~~~~~~~

Contact me (preferably using e-mail) if you have any questions, suggestions, remarks, etc., on this document, on UU, or on any other of my programs.

Also, if you find a valid uuencoded file that UU does not process correctly, please let me know. And if at all possible, pray send that file along to me (or otherwise a detailed description of its contents), preferably in some (any) compressed form in order to keep my mail server from automagically ruining it. Beyond my control, my mail server automatically decodes (or tries to anyway) uuencoded files, so I would not end up with your uuencoded file. Thank you very much!

I check the alt.binaries.pictures.misc and alt.binaries.pictures.utilities newsgroups on USENET regularly, so you could also try placing messages for me there.

Finally, please send me an e-mail if you think my program is of use to you (or flame me if you think it is useless). If I do not get enough feedback, I take it that people are not interested, and I shall ... continue writing programs for myself, but DIScontinue spreading them on anything but a very small scale.

Ben Jos Walbeehm (Please get my first name right, it is "Ben Jos".)
Lijsterbeslaan 20
5248 BB Rosmalen
The Netherlands

Phone : +31 4192 14345 (The best time (GMT) to get hold of me is at night!)
E-mail: Walbeehm@fsw.ruu.nl